Add pinned post support to feed generator #149
Merged
The bulk insert paths for posts and follows were not updating profile_agg counters (postsCount, followsCount, followersCount). This caused profiles to show 0 counts despite having posts/follows.

Added profile_agg updates after bulk inserts:
- copy_insert_posts: Update postsCount for creators
- copy_insert_follows: Update followsCount for creators, followersCount for subjects
The quote table was missing sortAt which is required by the getQuotes dataplane route for pagination.
The posts_with_media filter in getAuthorFeed was returning empty results because the wintermute indexer was not populating the post_embed_image and post_embed_video tables.

Changes:
- Update handle_post_embeds() to detect and process image/video embeds
- Add handle_embed_images() and handle_embed_video() for single-record indexing
- Add extract_embed_data(), extract_images(), extract_video() for bulk processing
- Add copy_insert_post_embed_images() and copy_insert_post_embed_videos() bulk functions using COPY protocol
- Update copy_batch_insert_posts() to extract and insert embed data

This fixes the media tab showing empty on user profiles.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Two migration scripts:
- backfill_post_embeds.sql: Single-shot migration for smaller datasets
- backfill_post_embeds_batched.sql: Batched approach for large tables

Run after deploying the indexer fix to populate post_embed_image and post_embed_video for existing posts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ON CONFLICT DO NOTHING to all notification INSERT statements:
  - like notifications (was missing)
  - follow notifications (was missing)
  - repost notifications (was missing)
  - starterpack-joined notifications (was missing)
  - reply and quote already had it
- Add migration scripts:
  - dedupe_notifications.sql: Full deduplication for large tables
  - add_notification_unique_constraint.sql: Adds unique constraint
- Fix clippy warnings:
  - Use ToOwned::to_owned instead of closure
  - Use map_or_else instead of match for option handling
Implements app.bsky.video.* lexicon endpoints using Bunny Stream for video transcoding and CDN delivery.

Features:
- getUploadLimits: Check user quotas
- uploadVideo: Upload video to Bunny Stream
- getJobStatus: Poll transcoding status
- Bunny webhook handler for encoding completion
- URL proxy mapping did/cid to Bunny video IDs

Configuration:
- BUNNY_LIBRARY_ID, BUNNY_API_KEY, BUNNY_PULL_ZONE
- VIDEO_SERVICE_DID, VIDEO_PUBLIC_URL
- DATABASE_URL for job tracking
Previously the video service was using Bunny's UUID as the blob $link, which is invalid per AT Protocol spec. CIDs must be content-addressed hashes (e.g., bafkreibjfgx2gprinfvicegelk5kosd6y2frmqpqzwqkg7usac74l3t2v4).

Changes:
- Add CID generation using cid and multihash-codetable crates
- Generate CIDv1 (raw codec 0x55, SHA-256) from video bytes during upload
- Store video_cid in job record for later use
- Use proper CID in blob reference when webhook completes job
- Update video_mappings to use content CID instead of Bunny UUID

This ensures video blobs have valid content-addressed identifiers that comply with the AT Protocol specification.
Implements proper AT Protocol blob flow:
- Video service now uploads blob to user's PDS first using forwarded service auth token
- PDS returns valid blob_ref which is stored in video_jobs.pds_blob_ref
- Client can then reference the blob in posts without BlobNotFound errors

Key changes:
- Add pds/ module using atrium for AT Protocol operations
- Add pds_blob_ref column to video_jobs table
- Extract PDS DID from token's aud claim
- Resolve did:web directly, did:plc via plc.directory
- Upload blob to PDS, then to Bunny for transcoding
Video service now has its own identity (did:web:video.blacksky.community) and creates service auth tokens to upload blobs to user PDSs.

Changes:
- Add signing module with K-256 JWT signing
- Load signing key from SIGNING_KEY_PATH env var
- Create service auth tokens with iss=video_service, aud=pds, sub=user
- Update PDS client to resolve user DID to their PDS endpoint
- Update Cargo.toml with k256 and sec1 dependencies
The atrium XRPC client was not properly forwarding the Authorization header, resulting in 'Bearer did:plc:...' being sent instead of the actual JWT token. Changed to direct reqwest HTTP calls with explicit headers for blob upload.
The PDS couldn't verify tokens signed by the video service because it doesn't resolve did:web DIDs for external services. New approach: client requests service auth token with aud=pds_did (not video_service_did), and we forward that token directly to the PDS. The PDS can verify it since it's signed by the user's own signing key.
Previously wintermute only processed #commit events from the firehose, ignoring identity changes, account status updates, and sync events.

Changes:
- Add IdentityData and AccountData types to FirehoseEvent
- Parse #identity events and update actor handles via DID resolution
- Parse #account events and update actor upstream_status
- Parse #sync events and refresh handles (like identity events)
- Update all tests to include new FirehoseEvent fields

This fixes handle changes not being reflected in the appview.
Changed ON CONFLICT DO NOTHING to ON CONFLICT DO UPDATE for records that can be legitimately updated by users:
- profile: displayName, description, avatarCid, bannerCid
- feed_generator: displayName, description, avatarCid
- list: name, description, avatarCid
- starter_pack: name

Also fixed batch_insert_profiles to include the avatarCid and bannerCid columns, which were previously missing.

This fixes the bug where profile updates (like changing avatar/bio) were not being reflected in the appview because the original record was kept due to ON CONFLICT DO NOTHING.
Updated all 6 notification INSERT statements to use:

    ON CONFLICT (did, "recordUri", reason) DO NOTHING

This prevents duplicate notifications when the same event is processed multiple times (e.g., from live firehose and backfill).

Requires adding a unique index on the notification table:

    CREATE UNIQUE INDEX notification_unique_idx ON notification (did, "recordUri", reason);
The ON CONFLICT (did, recordUri, reason) clause requires a unique index to exist, otherwise PostgreSQL throws an error. Reverting to ON CONFLICT DO NOTHING until the unique index can be created during a maintenance window.
Added ON CONFLICT (did, recordUri, reason) DO NOTHING to all 6 notification INSERT statements. Works with the notification_unique_idx index that prevents duplicate notifications from being created. This fixes the issue where the same notification could appear multiple times with the same indexedAt timestamp due to parallel processing or retries.
This reverts commit 6a55352.
Changed snake_case to camelCase to match PostgreSQL schema: - indexed_at -> indexedAt - upstream_status -> upstreamStatus
Changed from ON CONFLICT ON CONSTRAINT actor_block_unique_subject to ON CONFLICT (creator, subjectDid) because there are two unique constraints on the same columns and the insert was hitting the other one.
The table has multiple unique constraints (uri PK, plus two on creator/subjectDid). ON CONFLICT DO NOTHING handles conflicts on any of them.
Changed all 6 notification INSERT statements to use ON CONFLICT (did, "recordUri", reason) DO NOTHING instead of just ON CONFLICT DO NOTHING. This requires the unique index notification_unique_idx to exist on the notification table with columns (did, "recordUri", reason).
Previously only the direct parent author received a notification for replies. Bluesky also notifies the thread root author (the person who started the thread) for any reply in their thread, up to 5 levels deep.

This change adds a notification to the root post author when:
- The reply has a root that differs from the parent (nested reply)
- The root author is not the same as the post creator

This matches Bluesky's behavior where users see notifications for replies anywhere in threads they started, not just direct replies.
Replaces the simple parent+root reply notification with the official Bluesky behavior from post.ts notifsForInsert:

1. Mention notifications: Parse post facets and create notifications for app.bsky.richtext.facet#mention features. Previously mentions were not generating any notifications.
2. Reply ancestor walk: Use recursive CTE to walk up the thread ancestor chain up to REPLY_NOTIF_DEPTH (5 levels), notifying each ancestor author. This matches the official behavior where users get notified for replies anywhere in their thread, not just direct replies.
3. Descendant notifications for out-of-order indexing: When a post in the middle of a thread is indexed after its replies, notify ancestors about existing descendant replies. Uses recursive CTE to find descendants, then cross-products with ancestors where depth + height < REPLY_NOTIF_DEPTH.

Deduplication is handled by ON CONFLICT (did, recordUri, reason) DO NOTHING on all notification inserts.
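The reply ancestor walk in item 2 can be pictured with an in-memory sketch. This is not the indexer's code: the real implementation uses a recursive CTE in PostgreSQL, and the `Post` struct, the `HashMap` store, and `reply_notification_targets` are hypothetical names introduced for illustration. The exact depth accounting (which ancestor counts as level 1) is also an assumption.

```rust
use std::collections::HashMap;

/// Maximum number of ancestors notified, mirroring REPLY_NOTIF_DEPTH (5 levels).
pub const REPLY_NOTIF_DEPTH: usize = 5;

/// Hypothetical in-memory post: author DID plus the URI of its parent, if any.
pub struct Post {
    pub author: String,
    pub parent: Option<String>,
}

/// Walk up the parent chain from `reply_uri` and collect the authors to notify.
/// Stops after REPLY_NOTIF_DEPTH ancestors or when a parent is not indexed yet.
pub fn reply_notification_targets(posts: &HashMap<String, Post>, reply_uri: &str) -> Vec<String> {
    let Some(reply) = posts.get(reply_uri) else {
        return Vec::new();
    };
    let mut targets: Vec<String> = Vec::new();
    let mut next = reply.parent.as_deref();
    let mut depth = 0;
    while let Some(uri) = next {
        if depth >= REPLY_NOTIF_DEPTH {
            break;
        }
        let Some(ancestor) = posts.get(uri) else {
            break; // ancestor missing (out-of-order indexing); nothing more to walk
        };
        // Skip self-notifications; skipping repeated authors stands in for the
        // ON CONFLICT ... DO NOTHING deduplication done in the database.
        if ancestor.author != reply.author && !targets.contains(&ancestor.author) {
            targets.push(ancestor.author.clone());
        }
        next = ancestor.parent.as_deref();
        depth += 1;
    }
    targets
}
```

With a chain of seven posts and a reply attached to the last one, only the five nearest ancestor authors are returned; the two oldest posts fall outside the depth limit.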
Tracing mention notification inserts to debug missing mention notifications in production.
The label ingestion pipeline was missing the neg field entirely, causing negation labels (neg: true) to be stored as neg: false. This meant labels that Bluesky removed were never actually negated in our database.

Changes:
- Add neg, cid, exp fields to Label and RawLabel structs
- Parse neg from CBOR label messages (defaults to false if absent)
- Update indexer INSERT to use actual neg value instead of hardcoded false
- ON CONFLICT now updates neg, cts, and exp (matching TS dataplane)
- Log negation labels at info level for visibility
- Add test_label_negation and test_parse_label_message_with_negation
- Fix clippy if_not_else lint in mention notification code
… app.bsky.verification.proof The indexer was matching on a nonexistent collection type app.bsky.verification.proof instead of the correct app.bsky.graph.verification lexicon. This caused all verification records from the firehose to be silently ignored, leaving the verification table empty. Also fixed the URI format strings in index_verification and delete_verification to use the correct collection path.
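A minimal sketch of the corrected matching, assuming the indexer dispatches on the collection NSID as a string and builds record URIs in the usual `at://<did>/<collection>/<rkey>` form. The function names here are hypothetical and are not the ones used in the indexer.

```rust
/// Collection NSID for verification records, as named in the fix above.
pub const VERIFICATION_COLLECTION: &str = "app.bsky.graph.verification";

/// Returns true only for the real verification collection.
/// The old, nonexistent `app.bsky.verification.proof` no longer matches.
pub fn is_verification_record(collection: &str) -> bool {
    collection == VERIFICATION_COLLECTION
}

/// Build the AT URI for a verification record from its repo DID and record key.
pub fn verification_uri(did: &str, rkey: &str) -> String {
    format!("at://{}/{}/{}", did, VERIFICATION_COLLECTION, rkey)
}
```

Keeping the NSID in one constant means the match arm and the URI format strings cannot drift apart again.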
Tracks remaining items:
- Hydration-time CID verification in client
- Firehose listener for orphaned content cleanup
- Community post threadgate support
- Community feed aggregation
- Content expiration policy
Move all rsky-video SQL queries to reference the videos schema (videos.video_jobs, videos.upload_quotas, videos.video_mappings). Add CREATE SCHEMA IF NOT EXISTS videos to migrations. Includes cargo fmt formatting fixes across rsky-video.
When PINNED_POST_URI is set, the configured post is inserted at position 0 of the feed on first-page requests only (no cursor). Paginated scroll requests are unaffected. Banned users continue to see only the banned notice post.
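The rule described above could be sketched as follows. This is an illustration under stated assumptions, not the feed generator's actual code: `FeedItem`, `apply_pinned_post`, and the `viewer_banned` flag are hypothetical names, and the page passed in for a banned viewer is assumed to already hold only the banned notice post.

```rust
/// Hypothetical feed skeleton item: just the AT URI of a post.
#[derive(Debug, Clone, PartialEq)]
pub struct FeedItem {
    pub post: String,
}

/// Insert the pinned post at position 0, but only on first-page requests.
pub fn apply_pinned_post(
    mut page: Vec<FeedItem>,
    pinned_uri: Option<&str>,
    cursor: Option<&str>,
    viewer_banned: bool,
) -> Vec<FeedItem> {
    // Paginated scroll requests carry a cursor and are left unchanged;
    // banned viewers keep seeing only the banned notice post.
    if cursor.is_some() || viewer_banned {
        return page;
    }
    if let Some(uri) = pinned_uri {
        page.insert(0, FeedItem { post: uri.to_owned() });
    }
    page
}
```

In a real service the `pinned_uri` argument would presumably come from something like `std::env::var("PINNED_POST_URI").ok()`, so that leaving the variable unset keeps feeds exactly as before.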
Notification inserts were using .await? which caused the entire indexing function to bail out on failure. This prevented post_agg, profile_agg, and feed_agg updates from running. Changed all 8 notification INSERT sites to log warnings instead of propagating errors, ensuring aggregate count updates always complete. Root cause: notification_id_seq hit 32-bit integer max (2147483647), causing all notification inserts to fail and cascading to block all aggregate count updates across the appview.
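The change in error handling can be illustrated with a small synchronous sketch. The real code is async and talks to PostgreSQL; `insert_notification`, `update_aggregates`, `index_post`, and `IndexOutcome` are hypothetical stand-ins, and the warnings vector stands in for a log call.

```rust
/// Result of one indexing pass in this sketch.
#[derive(Debug, Default, PartialEq)]
pub struct IndexOutcome {
    pub notification_warnings: Vec<String>,
    pub aggregates_updated: bool,
}

/// Hypothetical stand-in for a notification INSERT; fails when `fail` is true.
fn insert_notification(fail: bool) -> Result<(), String> {
    if fail {
        Err("notification id sequence exhausted".to_owned())
    } else {
        Ok(())
    }
}

/// Hypothetical stand-in for the post_agg / profile_agg / feed_agg updates.
fn update_aggregates() -> Result<(), String> {
    Ok(())
}

pub fn index_post(notification_fails: bool) -> Result<IndexOutcome, String> {
    let mut outcome = IndexOutcome::default();
    // Before the fix this was equivalent to `insert_notification(..)?`,
    // which returned early and skipped the aggregate updates below.
    if let Err(e) = insert_notification(notification_fails) {
        // Record a warning instead of propagating the error.
        outcome
            .notification_warnings
            .push(format!("notification insert failed: {e}"));
    }
    // Aggregate failures are still propagated; only notifications are best-effort.
    update_aggregates()?;
    outcome.aggregates_updated = true;
    Ok(outcome)
}
```

The trade-off is that notification failures become visible only through logs, so an exhausted sequence like the one described above still needs separate monitoring.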
Summary
- Adds a PINNED_POST_URI environment variable to configure a pinned post

Test plan
- Set PINNED_POST_URI to a valid AT URI and verify it appears first on initial feed load
- Unset PINNED_POST_URI and verify feeds behave as before